Overview

Dataset statistics

Number of variables: 12
Number of observations: 8190
Missing cells: 24040
Missing cells (%): 24.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 712.0 KiB
Average record size in memory: 89.0 B
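
The two derived figures above follow from the raw counts. The sketch below redoes the arithmetic, assuming the missing-cell percentage is taken over every cell (variables × observations) and that 1 KiB is 1024 bytes; the variable names are illustrative and are not part of the report.

```python
# Values copied from the dataset statistics above.
n_variables = 12
n_observations = 8190
missing_cells = 24040
total_size_kib = 712.0

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions both values round to the figures the report shows.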

Variable types

Numeric: 10
Categorical: 1
Boolean: 1

Dataset

Description: This is a sample profiling report.
Creator: maryletteroa
Author: maryletteroa
URL: www.example.com

Alerts

Date has a high cardinality: 182 distinct values [High cardinality]
MarkDown1 is highly correlated with MarkDown4 and 1 other field [High correlation]
MarkDown4 is highly correlated with MarkDown1 [High correlation]
MarkDown5 is highly correlated with MarkDown1 [High correlation]
MarkDown1 is highly correlated with MarkDown4 [High correlation]
Store is highly correlated with CPI and 1 other field [High correlation]
Fuel_Price is highly correlated with Unemployment [High correlation]
MarkDown3 is highly correlated with IsHoliday [High correlation]
CPI is highly correlated with Store and 1 other field [High correlation]
Unemployment is highly correlated with Store and 2 other fields [High correlation]
IsHoliday is highly correlated with MarkDown3 [High correlation]
MarkDown1 has 4158 (50.8%) missing values [Missing]
MarkDown2 has 5269 (64.3%) missing values [Missing]
MarkDown3 has 4577 (55.9%) missing values [Missing]
MarkDown4 has 4726 (57.7%) missing values [Missing]
MarkDown5 has 4140 (50.5%) missing values [Missing]
CPI has 585 (7.1%) missing values [Missing]
Unemployment has 585 (7.1%) missing values [Missing]
MarkDown5 is highly skewed (γ1 = 50.2778242) [Skewed]
Date is uniformly distributed [Uniform]
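
The missing-value alerts can be cross-checked against the dataset statistics. This is a small sketch, assuming each percentage is the column's missing count divided by the 8190 observations; the dictionary is transcribed from the alerts and the names are illustrative.

```python
# Missing counts per column, transcribed from the alerts above.
n_observations = 8190
missing_counts = {
    "MarkDown1": 4158,
    "MarkDown2": 5269,
    "MarkDown3": 4577,
    "MarkDown4": 4726,
    "MarkDown5": 4140,
    "CPI": 585,
    "Unemployment": 585,
}

# Assumption: each percentage is relative to the number of rows.
missing_pct = {
    column: round(100 * count / n_observations, 1)
    for column, count in missing_counts.items()
}

# The per-column counts should add up to the report's "Missing cells" total.
total_missing = sum(missing_counts.values())

for column, pct in missing_pct.items():
    print(f"{column}: {pct}%")
print("Total missing cells:", total_missing)
```

The per-column counts sum to 24040, the "Missing cells" total in the dataset statistics, so these seven columns account for every missing cell.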

Reproduction

Analysis started: 2021-10-21 07:10:35.950239
Analysis finished: 2021-10-21 07:11:01.608342
Duration: 25.66 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json

Variables

Store
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 45
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23
Minimum: 1
Maximum: 45
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 64.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 12
median: 23
Q3: 34
95-th percentile: 43
Maximum: 45
Range: 44
Interquartile range (IQR): 22

Descriptive statistics

Standard deviation: 12.9879661
Coefficient of variation (CV): 0.5646941782
Kurtosis: -1.201186459
Mean: 23
Median Absolute Deviation (MAD): 11
Skewness: 0
Sum: 188370
Variance: 168.6872634
Monotonicity: Increasing

Histogram with fixed size bins (bins=45)

Common values

Value              Count  Frequency (%)
1                  182    2.2%
24                 182    2.2%
26                 182    2.2%
27                 182    2.2%
28                 182    2.2%
29                 182    2.2%
30                 182    2.2%
31                 182    2.2%
32                 182    2.2%
33                 182    2.2%
Other values (35)  6370   77.8%

Smallest 10 values

Value  Count  Frequency (%)
1      182    2.2%
2      182    2.2%
3      182    2.2%
4      182    2.2%
5      182    2.2%
6      182    2.2%
7      182    2.2%
8      182    2.2%
9      182    2.2%
10     182    2.2%

Largest 10 values

Value  Count  Frequency (%)
45     182    2.2%
44     182    2.2%
43     182    2.2%
42     182    2.2%
41     182    2.2%
40     182    2.2%
39     182    2.2%
38     182    2.2%
37     182    2.2%
36     182    2.2%
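
The Store statistics can be reproduced from the value counts alone. The sketch below assumes every store ID from 1 to 45 appears exactly 182 times (consistent with the listed counts and with 45 × 182 = 8190 rows) and that the report uses the sample variance (n − 1 denominator). It uses only Python's standard `statistics` module, not the profiling library itself.

```python
import statistics

# Assumption: each of the 45 store IDs occurs exactly 182 times.
store = [store_id for store_id in range(1, 46) for _ in range(182)]

n = len(store)
total = sum(store)
mean = statistics.mean(store)
variance = statistics.variance(store)   # sample variance, n - 1 denominator
std_dev = statistics.stdev(store)       # square root of the sample variance
cv = std_dev / mean                     # coefficient of variation

# Quartiles with linear interpolation between the closest ranks.
q1, median, q3 = statistics.quantiles(store, n=4, method="inclusive")
iqr = q3 - q1

print(n, total, mean)
print(round(variance, 7), round(std_dev, 7), round(cv, 10))
print(q1, median, q3, iqr)
```

With these assumptions the computed sum, mean, variance, standard deviation, CV, quartiles and IQR agree with the values listed for Store.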

Date
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 182
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 64.1 KiB

Value               Count
05/02/2010          45
13/04/2012          45
27/04/2012          45
04/05/2012          45
11/05/2012          45
Other values (177)  7965

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 05/02/2010
2nd row: 12/02/2010
3rd row: 19/02/2010
4th row: 26/02/2010
5th row: 05/03/2010
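
Date is profiled as a categorical variable because it is stored as text. The sketch below parses the five sample rows, assuming a day/month/year layout (values such as 13/04/2012 elsewhere in the report are only valid day-first), and measures the gap between consecutive rows. It uses the standard `datetime` module and is an illustration, not something the report itself does.

```python
from datetime import datetime

# The five sample rows listed above.
sample = ["05/02/2010", "12/02/2010", "19/02/2010", "26/02/2010", "05/03/2010"]

# Assumption: the strings are day/month/year.
parsed = [datetime.strptime(text, "%d/%m/%Y").date() for text in sample]

# Number of days between consecutive sample rows.
gaps = [(later - earlier).days for earlier, later in zip(parsed, parsed[1:])]

print(parsed[0].isoformat(), parsed[-1].isoformat())
print(gaps)
```

Under the day-first assumption the sample rows are spaced seven days apart, which suggests weekly observations.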

Common Values

Value               Count  Frequency (%)
05/02/2010          45     0.5%
13/04/2012          45     0.5%
27/04/2012          45     0.5%
04/05/2012          45     0.5%
11/05/2012          45     0.5%
18/05/2012          45     0.5%
25/05/2012          45     0.5%
01/06/2012          45     0.5%
08/06/2012          45     0.5%
15/06/2012          45     0.5%
Other values (172)  7740   94.5%

Length

Histogram of lengths of the category

Value               Count  Frequency (%)
05/02/2010          45     0.5%
09/07/2010          45     0.5%
02/07/2010          45     0.5%
19/02/2010          45     0.5%
26/02/2010          45     0.5%
05/03/2010          45     0.5%
12/03/2010          45     0.5%
19/03/2010          45     0.5%
26/03/2010          45     0.5%
02/04/2010          45     0.5%
Other values (172)  7740   94.5%

Most occurring characters

Value  Count  Frequency (%)
No values found.

Most occurring categories

Value  Count  Frequency (%)
No values found.

Most frequent character per category

Most occurring scripts

Value  Count  Frequency (%)
No values found.

Most frequent character per script

Most occurring blocks

Value  Count  Frequency (%)
No values found.

Most frequent character per block

Temperature
Real number (ℝ)

Distinct: 4178
Distinct (%): 51.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 59.3561978
Minimum: -7.29
Maximum: 101.95
Zeros: 0
Zeros (%): 0.0%
Negative: 4
Negative (%): < 0.1%
Memory size: 64.1 KiB

Quantile statistics

Minimum: -7.29
5-th percentile: 26.849
Q1: 45.9025
median: 60.71
Q3: 73.88
95-th percentile: 87.131
Maximum: 101.95
Range: 109.24
Interquartile range (IQR): 27.9775

Descriptive statistics

Standard deviation: 18.67860685
Coefficient of variation (CV): 0.3146867141
Kurtosis: -0.6108838043
Mean: 59.3561978
Median Absolute Deviation (MAD): 13.995
Skewness: -0.2833843522
Sum: 486127.26
Variance: 348.8903538
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
50.43                11     0.1%
70.28                11     0.1%
67.87                10     0.1%
70.87                9      0.1%
76.03                9      0.1%
76.67                9      0.1%
72.62                9      0.1%
50.81                8      0.1%
53.59                8      0.1%
47.55                8      0.1%
Other values (4168)  8098   98.9%

Smallest 10 values

Value  Count  Frequency (%)
-7.29  1      < 0.1%
-6.61  1      < 0.1%
-6.08  1      < 0.1%
-2.06  1      < 0.1%
0.25   1      < 0.1%
2.32   1      < 0.1%
2.45   1      < 0.1%
4      1      < 0.1%
5.54   1      < 0.1%
6.23   1      < 0.1%

Largest 10 values

Value   Count  Frequency (%)
101.95  3      < 0.1%
100.14  1      < 0.1%
100.07  1      < 0.1%
99.66   2      < 0.1%
99.22   3      < 0.1%
99.2    1      < 0.1%
98.43   1      < 0.1%
98.15   1      < 0.1%
97.66   1      < 0.1%
97.6    1      < 0.1%

Fuel_Price
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1011
Distinct (%): 12.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.405991819
Minimum: 2.472
Maximum: 4.468
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 64.1 KiB

Quantile statistics

Minimum: 2.472
5-th percentile: 2.669
Q1: 3.041
median: 3.513
Q3: 3.743
95-th percentile: 4.021
Maximum: 4.468
Range: 1.996
Interquartile range (IQR): 0.702

Descriptive statistics

Standard deviation: 0.4313365711
Coefficient of variation (CV): 0.1266405188
Kurtosis: -0.9523876532
Mean: 3.405991819
Median Absolute Deviation (MAD): 0.298
Skewness: -0.3050626486
Sum: 27895.073
Variance: 0.1860512376
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
3.417                43     0.5%
3.638                43     0.5%
3.63                 40     0.5%
3.583                39     0.5%
3.62                 37     0.5%
3.524                31     0.4%
3.622                31     0.4%
3.611                30     0.4%
3.227                30     0.4%
3.666                30     0.4%
Other values (1001)  7836   95.7%

Smallest 10 values

Value  Count  Frequency (%)
2.472  1      < 0.1%
2.513  1      < 0.1%
2.514  14     0.2%
2.52   1      < 0.1%
2.533  1      < 0.1%
2.539  1      < 0.1%
2.54   2      < 0.1%
2.542  1      < 0.1%
2.545  1      < 0.1%
2.548  14     0.2%

Largest 10 values

Value  Count  Frequency (%)
4.468  6      0.1%
4.449  6      0.1%
4.308  3      < 0.1%
4.301  6      0.1%
4.294  6      0.1%
4.293  3      < 0.1%
4.288  3      < 0.1%
4.282  3      < 0.1%
4.277  6      0.1%
4.273  6      0.1%

MarkDown1
Real number (ℝ)

HIGH CORRELATION
MISSING

Distinct: 4023
Distinct (%): 99.8%
Missing: 4158
Missing (%): 50.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 7032.371786
Minimum: -2781.45
Maximum: 103184.98
Zeros: 0
Zeros (%): 0.0%
Negative: 4
Negative (%): < 0.1%
Memory size: 64.1 KiB

Quantile statistics

Minimum: -2781.45
5-th percentile: 109.416
Q1: 1577.5325
median: 4743.58
Q3: 8923.31
95-th percentile: 21500.9325
Maximum: 103184.98
Range: 105966.43
Interquartile range (IQR): 7345.7775

Descriptive statistics

Standard deviation: 9262.747448
Coefficient of variation (CV): 1.317158383
Kurtosis: 23.68716731
Mean: 7032.371786
Median Absolute Deviation (MAD): 3569.965
Skewness: 4.016436305
Sum: 28354523.04
Variance: 85798490.28
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
150.46               2      < 0.1%
6510.79              2      < 0.1%
4855.31              2      < 0.1%
8.62                 2      < 0.1%
17.01                2      < 0.1%
175.64               2      < 0.1%
1.5                  2      < 0.1%
2920.43              2      < 0.1%
460.73               2      < 0.1%
4557.94              1      < 0.1%
Other values (4013)  4013   49.0%
(Missing)            4158   50.8%

Smallest 10 values

Value     Count  Frequency (%)
-2781.45  1      < 0.1%
-772.21   1      < 0.1%
-563.9    1      < 0.1%
-16.93    1      < 0.1%
0.27      1      < 0.1%
0.5       1      < 0.1%
1.5       2      < 0.1%
1.94      1      < 0.1%
2.12      1      < 0.1%
2.14      1      < 0.1%

Largest 10 values

Value      Count  Frequency (%)
103184.98  1      < 0.1%
95102.5    1      < 0.1%
88750.34   1      < 0.1%
88646.76   1      < 0.1%
84139.36   1      < 0.1%
80498.65   1      < 0.1%
78124.5    1      < 0.1%
77017.24   1      < 0.1%
75522.86   1      < 0.1%
75149.79   1      < 0.1%

MarkDown2
Real number (ℝ)

MISSING

Distinct: 2715
Distinct (%): 92.9%
Missing: 5269
Missing (%): 64.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 3384.176594
Minimum: -265.76
Maximum: 104519.54
Zeros: 3
Zeros (%): < 0.1%
Negative: 25
Negative (%): 0.3%
Memory size: 64.1 KiB

Quantile statistics

Minimum: -265.76
5-th percentile: 2.98
Q1: 68.88
median: 364.57
Q3: 2153.35
95-th percentile: 17261.44
Maximum: 104519.54
Range: 104785.3
Interquartile range (IQR): 2084.47

Descriptive statistics

Standard deviation: 8793.583016
Coefficient of variation (CV): 2.598440942
Kurtosis: 32.34218663
Mean: 3384.176594
Median Absolute Deviation (MAD): 355.37
Skewness: 4.962258122
Sum: 9885179.83
Variance: 77327102.25
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
3                    11     0.1%
1.5                  10     0.1%
0.5                  9      0.1%
4                    9      0.1%
0.03                 8      0.1%
1.91                 8      0.1%
6                    7      0.1%
3.82                 5      0.1%
0.01                 5      0.1%
11                   5      0.1%
Other values (2705)  2844   34.7%
(Missing)            5269   64.3%

Smallest 10 values

Value    Count  Frequency (%)
-265.76  1      < 0.1%
-192     1      < 0.1%
-35.74   1      < 0.1%
-20      1      < 0.1%
-15.45   1      < 0.1%
-10.98   1      < 0.1%
-10.5    2      < 0.1%
-9.98    1      < 0.1%
-9.94    1      < 0.1%
-7.76    1      < 0.1%

Largest 10 values

Value      Count  Frequency (%)
104519.54  1      < 0.1%
97740.99   1      < 0.1%
92523.94   1      < 0.1%
89121.94   1      < 0.1%
82881.16   1      < 0.1%
72413.71   1      < 0.1%
71074.17   1      < 0.1%
70574.85   1      < 0.1%
59362.3    1      < 0.1%
58804.91   1      < 0.1%

MarkDown3
Real number (ℝ)

HIGH CORRELATION
MISSING

Distinct: 2885
Distinct (%): 79.9%
Missing: 4577
Missing (%): 55.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 1760.10018
Minimum: -179.26
Maximum: 149483.31
Zeros: 1
Zeros (%): < 0.1%
Negative: 13
Negative (%): 0.2%
Memory size: 64.1 KiB

Quantile statistics

Minimum: -179.26
5-th percentile: 0.782
Q1: 6.6
median: 36.26
Q3: 163.15
95-th percentile: 1159.758
Maximum: 149483.31
Range: 149662.57
Interquartile range (IQR): 156.55

Descriptive statistics

Standard deviation: 11276.46221
Coefficient of variation (CV): 6.406716127
Kurtosis: 72.06807509
Mean: 1760.10018
Median Absolute Deviation (MAD): 34.16
Skewness: 8.133805548
Sum: 6359241.95
Variance: 127158599.9
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
1                    17     0.2%
3                    15     0.2%
2                    15     0.2%
6                    14     0.2%
0.6                  12     0.1%
4                    11     0.1%
1.2                  10     0.1%
0.24                 9      0.1%
0.5                  9      0.1%
0.3                  9      0.1%
Other values (2875)  3492   42.6%
(Missing)            4577   55.9%

Smallest 10 values

Value    Count  Frequency (%)
-179.26  1      < 0.1%
-89.1    1      < 0.1%
-44.54   1      < 0.1%
-29.1    1      < 0.1%
-23.97   1      < 0.1%
-17.44   1      < 0.1%
-14.29   1      < 0.1%
-2.58    1      < 0.1%
-1       1      < 0.1%
-0.87    1      < 0.1%

Largest 10 values

Value      Count  Frequency (%)
149483.31  1      < 0.1%
146394.44  1      < 0.1%
141630.61  1      < 0.1%
139621.51  1      < 0.1%
130129.11  1      < 0.1%
115048.81  1      < 0.1%
112255.67  1      < 0.1%
109976.14  1      < 0.1%
109030.75  1      < 0.1%
105691.67  1      < 0.1%

MarkDown4
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 3405
Distinct (%): 98.3%
Missing: 4726
Missing (%): 57.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 3292.935886
Minimum: 0.22
Maximum: 67474.85
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 64.1 KiB

Quantile statistics

Minimum: 0.22
5-th percentile: 18.4695
Q1: 304.6875
median: 1176.425
Q3: 3310.0075
95-th percentile: 12863.771
Maximum: 67474.85
Range: 67474.63
Interquartile range (IQR): 3005.32

Descriptive statistics

Standard deviation: 6792.329861
Coefficient of variation (CV): 2.06269727
Kurtosis: 29.00029382
Mean: 3292.935886
Median Absolute Deviation (MAD): 1070.015
Skewness: 4.864484796
Sum: 11406729.91
Variance: 46135744.95
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
3                    5      0.1%
2.5                  4      < 0.1%
2                    4      < 0.1%
4                    4      < 0.1%
2.61                 4      < 0.1%
9                    4      < 0.1%
3.97                 3      < 0.1%
0.63                 3      < 0.1%
12                   3      < 0.1%
8                    3      < 0.1%
Other values (3395)  3427   41.8%
(Missing)            4726   57.7%

Smallest 10 values

Value  Count  Frequency (%)
0.22   2      < 0.1%
0.41   1      < 0.1%
0.46   1      < 0.1%
0.63   3      < 0.1%
0.66   1      < 0.1%
0.78   2      < 0.1%
0.87   1      < 0.1%
0.92   1      < 0.1%
1.26   1      < 0.1%
1.47   1      < 0.1%

Largest 10 values

Value     Count  Frequency (%)
67474.85  1      < 0.1%
65344.64  1      < 0.1%
63830.91  1      < 0.1%
63130.81  1      < 0.1%
60065.82  1      < 0.1%
57817.56  1      < 0.1%
57815.43  1      < 0.1%
56735.25  1      < 0.1%
56600.97  1      < 0.1%
53603.99  1      < 0.1%

MarkDown5
Real number (ℝ)

HIGH CORRELATION
MISSING
SKEWED

Distinct: 4045
Distinct (%): 99.9%
Missing: 4140
Missing (%): 50.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 4132.216422
Minimum: -185.17
Maximum: 771448.1
Zeros: 0
Zeros (%): 0.0%
Negative: 2
Negative (%): < 0.1%
Memory size: 64.1 KiB

Quantile statistics

Minimum: -185.17
5-th percentile: 577.679
Q1: 1440.8275
median: 2727.135
Q3: 4832.555
95-th percentile: 10227.8585
Maximum: 771448.1
Range: 771633.27
Interquartile range (IQR): 3391.7275

Descriptive statistics

Standard deviation: 13086.69028
Coefficient of variation (CV): 3.16699053
Kurtosis: 2923.05653
Mean: 4132.216422
Median Absolute Deviation (MAD): 1482.82
Skewness: 50.2778242
Sum: 16735476.51
Variance: 171261462.4
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
986.23               2      < 0.1%
1327.97              2      < 0.1%
3113.78              2      < 0.1%
2743.18              2      < 0.1%
1064.56              2      < 0.1%
11604.37             1      < 0.1%
24475.38             1      < 0.1%
492.36               1      < 0.1%
12533.58             1      < 0.1%
6207.39              1      < 0.1%
Other values (4035)  4035   49.3%
(Missing)            4140   50.5%

Smallest 10 values

Value    Count  Frequency (%)
-185.17  1      < 0.1%
-37.02   1      < 0.1%
40.98    1      < 0.1%
60.92    1      < 0.1%
114.25   1      < 0.1%
134.47   1      < 0.1%
135.16   1      < 0.1%
142.75   1      < 0.1%
149.87   1      < 0.1%
153.04   1      < 0.1%

Largest 10 values

Value      Count  Frequency (%)
771448.1   1      < 0.1%
108519.28  1      < 0.1%
105223.11  1      < 0.1%
85851.87   1      < 0.1%
63005.58   1      < 0.1%
58068.14   1      < 0.1%
57029.78   1      < 0.1%
53212.72   1      < 0.1%
45648.88   1      < 0.1%
45050.55   1      < 0.1%

CPI
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 2505
Distinct (%): 32.9%
Missing: 585
Missing (%): 7.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 172.4608092
Minimum: 126.064
Maximum: 228.9764563
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 64.1 KiB

Quantile statistics

Minimum: 126.064
5-th percentile: 126.5621
Q1: 132.3648387
median: 182.7640032
Q3: 213.9324122
95-th percentile: 223.8693849
Maximum: 228.9764563
Range: 102.9124563
Interquartile range (IQR): 81.5675735

Descriptive statistics

Standard deviation: 39.7383461
Coefficient of variation (CV): 0.2304195735
Kurtosis: -1.832113304
Mean: 172.4608092
Median Absolute Deviation (MAD): 42.0385282
Skewness: 0.06766805636
Sum: 1311564.454
Variance: 1579.136151
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
132.7160968          33     0.4%
139.1226129          24     0.3%
224.8025314          12     0.1%
201.0705712          12     0.1%
130.9670968          11     0.1%
130.3849032          11     0.1%
130.4546207          11     0.1%
130.5502069          11     0.1%
130.6457931          11     0.1%
130.7413793          11     0.1%
Other values (2495)  7458   91.1%
(Missing)            585    7.1%

Smallest 10 values

Value        Count  Frequency (%)
126.064      11     0.1%
126.0766452  11     0.1%
126.0854516  11     0.1%
126.0892903  11     0.1%
126.1019355  11     0.1%
126.1069032  11     0.1%
126.1119032  11     0.1%
126.114      11     0.1%
126.1145806  11     0.1%
126.1266     11     0.1%

Largest 10 values

Value        Count  Frequency (%)
228.9764563  3      < 0.1%
228.8892482  1      < 0.1%
228.8020401  1      < 0.1%
228.7796682  3      < 0.1%
228.7298638  6      0.1%
228.714832   1      < 0.1%
228.6926456  1      < 0.1%
228.6428882  2      < 0.1%
228.6276239  1      < 0.1%
228.605623   1      < 0.1%

Unemployment
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 404
Distinct (%): 5.3%
Missing: 585
Missing (%): 7.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.826821039
Minimum: 3.684
Maximum: 14.313
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 64.1 KiB

Quantile statistics

Minimum: 3.684
5-th percentile: 5.143
Q1: 6.634
median: 7.806
Q3: 8.567
95-th percentile: 10.926
Maximum: 14.313
Range: 10.629
Interquartile range (IQR): 1.933

Descriptive statistics

Standard deviation: 1.877258594
Coefficient of variation (CV): 0.2398494337
Kurtosis: 2.498221012
Mean: 7.826821039
Median Absolute Deviation (MAD): 0.915
Skewness: 1.067685459
Sum: 59522.974
Variance: 3.524099828
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
8.099               78     1.0%
7.852               56     0.7%
8.163               56     0.7%
8.625               54     0.7%
6.237               52     0.6%
6.17                52     0.6%
6.565               52     0.6%
6.891               52     0.6%
7.057               52     0.6%
7.441               52     0.6%
Other values (394)  7049   86.1%
(Missing)           585    7.1%

Smallest 10 values

Value  Count  Frequency (%)
3.684  8      0.1%
3.879  13     0.2%
3.896  4      < 0.1%
3.921  13     0.2%
3.932  26     0.3%
4.077  13     0.2%
4.125  26     0.3%
4.145  26     0.3%
4.156  26     0.3%
4.261  26     0.3%

Largest 10 values

Value   Count  Frequency (%)
14.313  42     0.5%
14.18   39     0.5%
14.099  39     0.5%
14.021  36     0.4%
13.975  24     0.3%
13.736  39     0.5%
13.503  42     0.5%
12.89   39     0.5%
12.187  39     0.5%
11.627  39     0.5%

IsHoliday
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.1 KiB

Value  Count  Frequency (%)
False  7605   92.9%
True   585    7.1%

Interactions

[Figures: pairwise interaction plots between the numeric variables; images not reproduced.]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
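
The calculation described above can be sketched in plain Python. This is an illustrative implementation, not the code the profiling library runs; it assumes tied values receive the average of their rank positions, and the helper names are hypothetical.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of standard deviations.
    The 1/(n - 1) factors cancel, so plain sums are used."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]   # monotonic but nonlinear in x
print(spearman(x, y), pearson(x, y))
```

For this monotonic but nonlinear example ρ is 1 up to floating-point rounding, while Pearson's r on the raw values stays below 1.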

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
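
A minimal sketch of this formula, using only the standard `statistics` module; the function name and the toy data are illustrative. The second half checks the invariance mentioned above by shifting and rescaling both variables (with positive scale factors; a negative factor would flip the sign of r).

```python
from statistics import mean, stdev

def pearson_r(x, y):
    """Sample covariance of x and y divided by the product of their
    sample standard deviations."""
    n = len(x)
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    return cov / (stdev(x) * stdev(y))

# Toy data, roughly linear.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r = pearson_r(x, y)

# Change location and scale of each variable separately.
x_rescaled = [10 * a + 3 for a in x]
y_rescaled = [0.5 * b - 7 for b in y]
r_rescaled = pearson_r(x_rescaled, y_rescaled)

print(r, r_rescaled)
```

Both calls return the same value up to floating-point rounding.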

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
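
The pair-counting definition above can be sketched directly. This implements the formula exactly as stated (often called tau-a) and makes no claim about which variant the profiling library computes; tied pairs count as neither concordant nor discordant, and the function name is hypothetical.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move in the same direction
        elif direction < 0:
            discordant += 1      # the variables move in opposite directions
        # direction == 0 is a tie and is counted in neither group
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

In the example, five of the six pairs are concordant and one is discordant, so τ = (5 − 1) / 6.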

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.

Sample

First rows

   Store  Date        Temperature  Fuel_Price  MarkDown1  MarkDown2  MarkDown3  MarkDown4  MarkDown5  CPI         Unemployment  IsHoliday
0  1      05/02/2010  42.31        2.572       NaN        NaN        NaN        NaN        NaN        211.096358  8.106         False
1  1      12/02/2010  38.51        2.548       NaN        NaN        NaN        NaN        NaN        211.242170  8.106         True
2  1      19/02/2010  39.93        2.514       NaN        NaN        NaN        NaN        NaN        211.289143  8.106         False
3  1      26/02/2010  46.63        2.561       NaN        NaN        NaN        NaN        NaN        211.319643  8.106         False
4  1      05/03/2010  46.50        2.625       NaN        NaN        NaN        NaN        NaN        211.350143  8.106         False
5  1      12/03/2010  57.79        2.667       NaN        NaN        NaN        NaN        NaN        211.380643  8.106         False
6  1      19/03/2010  54.58        2.720       NaN        NaN        NaN        NaN        NaN        211.215635  8.106         False
7  1      26/03/2010  51.45        2.732       NaN        NaN        NaN        NaN        NaN        211.018042  8.106         False
8  1      02/04/2010  62.27        2.719       NaN        NaN        NaN        NaN        NaN        210.820450  7.808         False
9  1      09/04/2010  65.86        2.770       NaN        NaN        NaN        NaN        NaN        210.622857  7.808         False

Last rows

      Store  Date        Temperature  Fuel_Price  MarkDown1  MarkDown2  MarkDown3  MarkDown4  MarkDown5  CPI  Unemployment  IsHoliday
8180  45     24/05/2013  67.11        3.627       3249.34    481.82     58.48      1183.23    1309.30    NaN  NaN           False
8181  45     31/05/2013  65.88        3.646       6474.49    411.38     77.06      9.38       4227.27    NaN  NaN           False
8182  45     07/06/2013  70.71        3.633       9977.82    744.29     80.00      4825.71    3597.34    NaN  NaN           False
8183  45     14/06/2013  70.01        3.632       2471.44    517.87     348.54     2612.33    3459.39    NaN  NaN           False
8184  45     21/06/2013  70.13        3.626       4989.34    385.31     178.56     2463.42    3117.94    NaN  NaN           False
8185  45     28/06/2013  76.05        3.639       4842.29    975.03     3.00       2449.97    3169.69    NaN  NaN           False
8186  45     05/07/2013  77.50        3.614       9090.48    2268.58    582.74     5797.47    1514.93    NaN  NaN           False
8187  45     12/07/2013  79.37        3.614       3789.94    1827.31    85.72      744.84     2150.36    NaN  NaN           False
8188  45     19/07/2013  82.84        3.737       2961.49    1047.07    204.19     363.00     1059.46    NaN  NaN           False
8189  45     26/07/2013  76.06        3.804       212.02     851.73     2.06       10.88      1864.57    NaN  NaN           False